Modeling Customer Lifetime With Dynamic Customer Feedback Information

New Perspectives in Business and Econometrics

Alexander Kulumbeg

Marketing Institutes MCA & RDS

Daniel Winkler

Introduction

Story

  • Contractual setting - curated shopping
  • Nation-wide apparel subscription box service provider
  • Female customers only


  • Monthly surprise boxes with clothes selected by a stylist (person)
  • Option for customer to approve or change something in the box
  • Once received - customer rates each item by category, with optional written feedback

Story II

Idea

  • Propensity to churn changes over time
  • Traditionally, customer data is
    • hard to obtain
    • static / collected once
  • Written feedback contains (un)conscious pieces of information
  • Feedback changes over time
    • Stylist did a better/worse job than before
    • Clothes’ color/fit/cut/size/material is good/bad
    • Items did/didn’t adhere to the customer preferences stated in the quiz


  • What is hiding in the dynamic feedback (e.g., emotionality, eloquence, engagement…)?
  • How do these components influence the risk of customer attrition?
  • Can we identify other (latent) time-varying signals that affect customer lifetime?

Data

  • Information on
    • Orders
    • Feedback
    • App usage
    • Customer journey
    • Style preferences
    • Stylist performance
    • Previews of Boxes
  • ca. 57,000 unique customers
  • ca. 260,000 transactions
  • ca. 1,050,000 feedback items
  • Distilled into a box-level dataframe
  • User demographics
  • User contract length
  • User lifetime spending
  • Box-level feedback variables
    • Word count
    • Sentiment
    • Eloquence

Model

    Causal Model

    Model Details

    A Bayesian Model for Time-Varying Parameters


    A piecewise exponential model for lifetimes.


    \[ \lambda(t|\boldsymbol z_i; t \in (s_{j-1}, s_j]) = \lambda_{ij} = \exp\left(\beta_{0j} + \sum_{k=1}^{K} z_{i k} \beta_{kj}\right) \]
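The hazard above is constant within each interval, so the survival function follows by accumulating hazard times exposure over the intervals. A minimal sketch of that computation, assuming hypothetical cut points and coefficient values chosen only for illustration (the function names are not from the talk):

```python
import math

def hazard_rates(beta0, betas, z):
    """Per-interval hazards lambda_ij = exp(beta_0j + sum_k z_ik * beta_kj).

    beta0: J intercepts; betas: J lists of K coefficients;
    z: K covariate values for one customer.
    """
    return [
        math.exp(b0 + sum(zk * bk for zk, bk in zip(z, bj)))
        for b0, bj in zip(beta0, betas)
    ]

def survival(t, cuts, lam):
    """S(t) = exp(-cumulative hazard), with cuts = [s_1, ..., s_J] and s_0 = 0.

    The hazard is only defined up to s_J here, so accumulation stops there.
    """
    cum, start = 0.0, 0.0
    for s_j, lam_j in zip(cuts, lam):
        if t <= start:
            break
        cum += lam_j * (min(t, s_j) - start)  # exposure within (s_{j-1}, s_j]
        start = s_j
    return math.exp(-cum)

# Hypothetical example: three monthly intervals, one covariate
cuts = [1.0, 2.0, 3.0]
lam = hazard_rates(beta0=[-2.0, -2.0, -1.5], betas=[[0.5], [0.2], [0.0]], z=[1.0])
print(survival(2.5, cuts, lam))
```

Because the coefficients carry the interval index, the effect of a covariate on churn risk can differ from one interval to the next.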

    Piecewise Exponential Model

    Evolution of the \(\beta_{kj}\)’s

    As in Hemming and Shaw (2002), Gaussian random walks with initial state \(\beta_{k0} \sim \mathcal{N}\left(\beta_{k}, \theta_{k}\right)\) are considered, with a separate innovation variance \(\theta_k\) for each covariate: \[ \beta_{kj}=\beta_{k, j-1}+w_{kj}, \quad w_{kj} \sim \mathcal{N}\left(0, \theta_{k}\right). \]
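To illustrate the role of the innovation variance, the random walk can be simulated directly. This is a sketch with made-up values for \(\beta_k\) and \(\theta_k\); `simulate_random_walk` is a hypothetical helper, not part of the estimation code:

```python
import random

def simulate_random_walk(beta_k, theta_k, n_intervals, rng):
    """Draw beta_{k0} ~ N(beta_k, theta_k), then add one N(0, theta_k) innovation per interval."""
    sd = theta_k ** 0.5  # theta_k is a variance; random.gauss expects a standard deviation
    path = [rng.gauss(beta_k, sd)]
    for _ in range(n_intervals):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path

rng = random.Random(1)
# theta_k = 0 switches the dynamics off: the coefficient stays at beta_k
static_path = simulate_random_walk(beta_k=0.3, theta_k=0.0, n_intervals=12, rng=rng)
# theta_k > 0 lets the coefficient drift across intervals
moving_path = simulate_random_walk(beta_k=0.3, theta_k=0.05, n_intervals=12, rng=rng)
print(static_path[-1], moving_path[-1])
```

With \(\theta_k = 0\) the path collapses to a constant coefficient, which is why shrinking \(\theta_k\) towards zero amounts to deciding whether an effect is time-varying at all.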


    Priors on Innovation Variances and Initial Value Means

    Triple gamma priors (Cadonna, Frühwirth-Schnatter, and Knaus 2020) are placed on both \(\beta_k\) and \(\theta_k\). The name stems from the fact that, when used for variances, the prior can be represented as a compound distribution consisting of three gamma distributions:

    \[ \begin{aligned} \theta_{k} \mid \xi_{k}^{2} &\sim \mathcal{G}\left(\frac{1}{2}, \frac{1}{2 \xi_{k}^{2}}\right), \\ \xi_{k}^{2} \mid a^{\xi}, \kappa_{k}^{2} &\sim \mathcal{G}\left(a^{\xi}, \frac{a^{\xi} \kappa_{k}^{2}}{2}\right), \\ \kappa_{k}^{2} \mid c^{\xi}, \kappa_{B}^{2} &\sim \mathcal{G}\left(c^{\xi}, \frac{c^{\xi}}{\kappa_{B}^{2}}\right). \end{aligned} \]

    The first-stage conditional prior on \(\theta_k\) implies the following conditional prior on \(\sqrt{\theta_k}\): \[ \sqrt{\theta_k} \mid \xi_k^2 \sim \mathcal{N}\left(0, \xi_k^2\right) \]
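The hierarchy can be sampled stage by stage, which also gives a quick numerical check that the gamma and normal representations of the first stage agree. The sketch below assumes the gamma distributions are parameterised by shape and rate (Python's `gammavariate` takes shape and scale, hence the inversions) and uses arbitrary hyperparameter values; all function names are hypothetical:

```python
import random

def draw_xi2(a_xi, c_xi, kappa_b2, rng):
    """Stages three and two: kappa_k^2, then xi_k^2 given kappa_k^2."""
    kappa2 = rng.gammavariate(c_xi, kappa_b2 / c_xi)     # rate c / kappa_B^2
    return rng.gammavariate(a_xi, 2.0 / (a_xi * kappa2))  # rate a * kappa_k^2 / 2

def draw_theta_gamma(xi2, rng):
    """Stage one as a gamma draw: shape 1/2, rate 1 / (2 xi_k^2)."""
    return rng.gammavariate(0.5, 2.0 * xi2)

def draw_theta_normal(xi2, rng):
    """Stage one via the square root: sqrt(theta_k) ~ N(0, xi_k^2)."""
    return rng.gauss(0.0, xi2 ** 0.5) ** 2

rng = random.Random(42)
# One full draw from the hierarchy (hyperparameters chosen arbitrarily)
theta = draw_theta_gamma(draw_xi2(a_xi=0.5, c_xi=0.5, kappa_b2=20.0, rng=rng), rng)

# For a fixed xi_k^2, both first-stage representations should have mean xi_k^2
n, xi2 = 20000, 2.0
mean_gamma = sum(draw_theta_gamma(xi2, rng) for _ in range(n)) / n
mean_normal = sum(draw_theta_normal(xi2, rng) for _ in range(n)) / n
print(mean_gamma, mean_normal)
```

The normal representation is convenient because it places the prior on a quantity that can be exactly zero in sign-symmetric fashion, so the same shrinkage machinery used for regression coefficients applies to the innovation standard deviations.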

    Adding a Factor (?)

    To account for unobserved heterogeneity in the data, a grouped factor component can be added to the hazard rates. Let observation \(i\) belong to group \(g\), with \(g \in\{1, \ldots, G\}\). The hazard rates then become \[ \lambda_{i j}=\exp \left(\phi_{g} f_{j}+\beta_{0 j}+\sum_{k=1}^{K} z_{i k} \beta_{k j}\right), \] where \(\phi_g\) is the loading of group \(g\) and the factor \(f_{j}\) varies over time according to a zero-mean stochastic volatility law of motion: \[ \begin{aligned} f_{j} & \sim \mathcal{N}\left(0, e^{h_{j}}\right), \\ h_{j} \mid h_{j-1}, \phi_{f}, \sigma_{f}^{2} & \sim \mathcal{N}\left(\phi_{f} h_{j-1}, \sigma_{f}^{2}\right),\\ h_{0} & \sim \mathcal{N}\left(0, \sigma_{f}^{2} /\left(1-\phi_{f}^{2}\right)\right) . \end{aligned} \]
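A forward simulation of the factor may help make the law of motion concrete. This is only a sketch under assumed parameter values; it requires \(|\phi_f| < 1\) so that the stationary variance of \(h_0\) exists, and both helper names are hypothetical:

```python
import math
import random

def simulate_factor(n_intervals, phi_f, sigma_f2, rng):
    """Draw h_0 from its stationary distribution, then h_j and f_j for j = 1..n_intervals."""
    sd = math.sqrt(sigma_f2)
    h = rng.gauss(0.0, sd / math.sqrt(1.0 - phi_f ** 2))  # h_0
    factors = []
    for _ in range(n_intervals):
        h = rng.gauss(phi_f * h, sd)                       # AR(1) step for the log-variance
        factors.append(rng.gauss(0.0, math.exp(h / 2.0)))  # f_j has variance exp(h_j)
    return factors

def hazard_with_factor(loading_g, f_j, beta0_j, beta_j, z_i):
    """lambda_ij = exp(phi_g * f_j + beta_0j + sum_k z_ik * beta_kj)."""
    linear = sum(z * b for z, b in zip(z_i, beta_j))
    return math.exp(loading_g * f_j + beta0_j + linear)

rng = random.Random(7)
factors = simulate_factor(n_intervals=12, phi_f=0.9, sigma_f2=0.1, rng=rng)
# Hypothetical group loading and coefficients for the first interval
print(hazard_with_factor(loading_g=0.4, f_j=factors[0], beta0_j=-2.0, beta_j=[0.5], z_i=[1.0]))
```

A group with loading zero is unaffected by the factor, so its hazard reduces to the piecewise exponential form without the factor term.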

    Results I

    Results II

    Results III

    Results IV

    Conclusion

    Discussion

    Cadonna, Annalisa, Sylvia Frühwirth-Schnatter, and Peter Knaus. 2020. “Triple the Gamma—A Unifying Shrinkage Prior for Variance and Variable Selection in Sparse State Space and TVP Models.” Econometrics 8 (2): 20.
    Gamerman, Dani. 1991. “Dynamic Bayesian models for survival data.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 40 (1): 63–79.
    Griffin, Jim, Phil Brown, et al. 2017. “Hierarchical shrinkage priors for regression models.” Bayesian Analysis 12 (1): 135–59.
    Hemming, Karla, and Ewart Shaw. 2002. “A parametric dynamic survival model applied to breast cancer survival times.” Journal of the Royal Statistical Society: Series C (Applied Statistics) 51 (4): 421–35.
    Hosszejni, Darjus, and Gregor Kastner. 2021. “Modeling Univariate and Multivariate Stochastic Volatility in R with stochvol and factorstochvol.” Journal of Statistical Software 100: 1–34.
    Wagner, Helga. 2011. “Bayesian estimation and stochastic model specification search for dynamic survival models.” Statistics and Computing 21 (2): 231–46.